DETAILED ACTION 

Claim Rejections - 35 USC §112 

The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention. 

Claims 1 to 8, 11 to 18, 46 to 53, 56 to 66, and 69 to 71 are rejected under 35 
U.S.C. 112, first paragraph, as failing to comply with the written description requirement. 
The claims contain subject matter which was not described in the specification in such 
a way as to reasonably convey to one skilled in the relevant art that the inventors, at the 
time the application was filed, had possession of the claimed invention. 

The limitation of "randomly" receiving non-voice input constitutes new matter 
because Applicants' Specification, as originally filed, does not expressly disclose 
"randomly" receiving non-voice input, and one having ordinary skill in the art would find 
that it is misdescriptive to say that non-voice input of receiving an e-mail is "randomly" 
received. Applicants are attempting to draw an invalid distinction between their 
disclosed speech recognition system and the prior art. One having ordinary skill in the 
art would not say that an e-mail is random. Of course, nobody knows what the content 
of an e-mail not yet received will be, but that doesn't make the content of the e-mail 
random. Any e-mail is still sent in a language (e.g. English), consisting of a vocabulary 
in that language, and adheres to grammatical conventions. An e-mail is not a random 
sequence of letters and words. Applicants' Specification, as originally filed, does not 
anywhere characterize receiving non-voice input as "random", and there is no basis 
in the Specification supporting a claim limitation of "randomly" receiving input. Thus, 
the claim limitation of "randomly" receiving non-voice input does not comply with the 
written description requirement of 35 U.S.C. §112, 1st ¶, because the limitation 
constitutes new matter and is misdescriptive of Applicants' disclosed speech recognition 
system and program code. 



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 1 to 8, 11 to 13, 15 to 17, 46 to 53, 56 to 58, 60 to 62, 64 to 66, and 69 to 
71 are rejected under 35 U.S.C. 103(a) as being unpatentable over Young et al. in view 
of Thelen et al. ('551). 

Concerning independent claims 1, 46, and 64, Young et al. discloses a speech 
recognition system and computer program, comprising: 

"preparing a first textual output from a speech signal by performing a speech 
recognition task to convert a speech signal into said first textual output, wherein said 
context-enhanced database is accessed to improve the speech recognition rate, 
wherein said speech signal is parsed into a plurality of computer processable speech 
segments, wherein said first textual output comprises a plurality of text segments, each 



Application/Control Number: 09/910,657 Page 4 

Art Unit: 2626 

corresponding to one of the computer processable speech segments, and wherein 
selected ones of the text segments are generated by matching a computer processable 
speech segment against an entry within the context-enhanced database, said context- 
enhanced database including a plurality of entries, each entry comprising a speech 
utterance and a corresponding textual segment for the speech utterance" - recognizer 
215 receives and processes frames ("parsed into a plurality of computer processable 
speech segments") of an utterance to identify text ("a first textual output") corresponding 
to the utterance ("a speech signal"); scores represent how well frames of an utterance 
match text hypotheses (column 4, lines 34 to 51: Figure 2); recognizer 215 processes 
frames 210 of an utterance in view of one or more constraint grammars 225 for placing 
a limitation on the order or grammatical form of the words ("a plurality of text segments") 
(column 4, lines 62: Figure 2); a constraint grammar ("a context-enhanced database") 
can include a language model for an active vocabulary or dictation topic vocabulary file 
(column 5, line 56 to column 6, line 40: Figure 2); a language model for a vocabulary file 
improves a speech recognition rate by matching entries of utterances with 
corresponding words; 

"enabling editing of said first textual output to generate a final voice-generated 
output" - a user may invoke an appropriate correction command when the system 
makes a recognition error (column 16, lines 26 to 65: Figures 13A to 13N); 

"making said final voice-generated output available" - best-scoring recognition 
candidates corresponding to dictated text are provided to an active application, such as 
a word processor, and are displayed through a graphical user interface (column 15, 
lines 17 to 24: Figure 2). 

Concerning independent claims 1, 46, and 64, Young et al. discloses active 
vocabularies that change based upon active applications currently executing upon the 
computer system, but omits randomly receiving non-voice input in a computer system 
communicatively linked to the speech recognition system, said input comprising at least 
one of text contained in an e-mail sent or received by the user, information in a 
document attached to an e-mail sent or received by the user, information in a document 
viewed by the user on a display of the computer system, information in a plurality of 
linked documents accessible to the computer system, information in a spread sheet 
executing on the computer system, call center information received via a facsimile 
device connected to the computer system, call center information received via a calling 
device connected to the computer system, and information recorded by a web browser 
executing on the computer system, and creating a word list defining a context-enhanced 
database based upon said input or modifying an existing context-enhanced database by 
adding a word list created based upon said input. However, Thelen et al. ('551) 
discloses a system for creating a vocabulary and/or language model for a speech 
recognition system from a set of documents based on a search criterion (Abstract), 
comprising: 

"randomly receiving non-voice input in a computer system communicatively 
linked to the speech recognition system, said input comprising at least one of text 
contained in an e-mail sent or received by the user, information in a document attached 
to an e-mail sent or received by the user, information in a document viewed by the user 
on a display of the computer system, information in a plurality of linked documents 
accessible to the computer system, information in a spread sheet executing on the 
computer system, call center information received via a facsimile device connected to 
the computer system, call center information received via a calling device connected to 
the computer system, and information recorded by a web browser executing on the 
computer system" - a vocabulary and/or language model is created by selecting 
documents from a set of documents based on a search criterion; by searching for 
documents based on a search criterion derived from a context identifier, pertinent 
documents are collected in an effective manner, increasing the quality of recognition; in 
one embodiment, the context identifier comprises one or more keywords, which acts as 
a search criterion, based on which the documents are selected; in another embodiment, 
the set of documents is formed by a document database or document file system in a 
distributed computer system; this allows for centrally storing (e.g. in a server) a larger 
set of documents than would normally be feasible to store or provide to a client 
computer; alternatively, a very large set of documents may be distributed over several 
servers, as over the Internet (column 3, line 20 to column 4, line 27; column 6, lines 11 
to 45); the content is derived from a distributed computer system or a set of documents 
distributed over several servers or the Internet ("information in a plurality of linked 
documents accessible to the computer system"); a set of documents distributed over 
several servers is "a plurality of linked documents"; although the received documents 
are collected based upon search criteria, the content of the collected documents is 
received "randomly" because the content of the received documents varies considerably 
within the search parameters and is probabilistic and nondeterministic; 

"creating a word list defining a context-enhanced database based upon said input 
or modifying an existing context-enhanced database by adding a word list created 
based upon said input" - a vocabulary and/or language model is created by selecting 
documents from a set of documents based on a search criterion; by searching for 
documents based on a search criterion derived from a context identifier, pertinent 
documents are collected in an effective manner, increasing the quality of recognition; 
(column 3, line 20 to column 4, line 27; column 6, lines 11 to 45); a vocabulary is 
equivalent to "a word list", and the vocabulary or language model is a "context- 
enhanced database". 
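
For illustration only, the kind of vocabulary construction described above can be sketched in a few lines of Python. This is not code from Thelen et al. ('551) or from Applicants' Specification; the function names `select_documents` and `build_word_list`, the rule that a document is selected when it contains at least one keyword, and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def select_documents(documents, keywords):
    """Keep documents containing at least one keyword (assumed search criterion)."""
    wanted = {k.lower() for k in keywords}
    return [doc for doc in documents if wanted & set(tokenize(doc))]


def build_word_list(documents, keywords):
    """Count every word in the selected documents, not only the keywords."""
    counts = Counter()
    for doc in select_documents(documents, keywords):
        counts.update(tokenize(doc))
    return counts


# Hypothetical document set; only the first two match the keywords.
docs = [
    "The radiologist reviewed the chest scan.",
    "The surgeon scheduled the operation.",
    "Quarterly sales rose in the spring.",
]
word_list = build_word_list(docs, ["radiologist", "surgeon"])
```

In this sketch the resulting word list contains words such as "scan" and "operation" that were never search terms, while words found only in the unselected document are excluded.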

Concerning independent claims 1, 46, and 64, Thelen et al. ('551) teaches that 
creating a vocabulary and/or language model from a set of documents distributed over 
several servers of the Internet has an advantage of increasing the quality of recognition 
by ensuring that pertinent language elements are covered, and excluding many 
irrelevant language elements, leading to faster recognition, and creation of a relatively 
small vocabulary or language model. (Column 3, Lines 26 to 43) Thus, it is suggested 
that documents relevant for a specific category of user, such as a radiologist, a surgeon, 
or a legal practitioner, can be created. (Column 3, Lines 11 to 20) It would have been 
obvious to one having ordinary skill in the art to create a vocabulary and/or language 
model from randomly received information in a plurality of linked documents accessible 
to the computer system for a speech recognition system as taught by Thelen et al. 
('551) in the speech recognition and computer program of Young et al. for a purpose of 
increasing the quality of recognition by ensuring that pertinent language elements are 
covered, and excluding many irrelevant language elements, leading to faster 
recognition, and creation of a relatively small vocabulary or language model. 

Concerning claims 2, 7, 47, and 52, Young et al. discloses speech recognition for 
dictation of words of text. 

Concerning claims 3 to 5, 15, 48 to 50, 60, and 65 to 66, Young et al. discloses that a 
complete dictation vocabulary consists of an active vocabulary plus a backup dictionary 
245; a system-wide backup dictionary contains all words known to the system; word 
searches of the backup vocabularies start with the user-specific backup dictionary and 
then check the system-wide backup dictionary ("before another database is searched") 
("a second database is accessed to find a matching word ... for which no matching 
word was found"); a user may add a word to a dictation vocabulary and a user-specific 
backup vocabulary ("the context-enhanced database is created from said input and from 
entries within the second database") (column 15, line 51 to column 16, line 25). 
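
The search order summarized above can be pictured with a short Python sketch. It is an illustration only, not code from Young et al.: the function name `find_word` and the sample vocabularies are hypothetical, and checking the active vocabulary before the backup dictionaries is an assumption of the example (the passage above states only the order of the two backup dictionaries).

```python
def find_word(word, active_vocabulary, user_backup, system_backup):
    """Return the name of the first vocabulary that contains the word, else None."""
    search_order = [
        ("active vocabulary", active_vocabulary),      # assumed to be checked first
        ("user-specific backup", user_backup),         # first backup dictionary searched
        ("system-wide backup", system_backup),         # searched last
    ]
    for name, vocabulary in search_order:
        if word in vocabulary:
            return name
    return None


# Hypothetical vocabularies for the example.
active = {"send", "mail"}
user_backup = {"radiograph"}
system_backup = {"radiograph", "zygote"}

where_radiograph = find_word("radiograph", active, user_backup, system_backup)
where_zygote = find_word("zygote", active, user_backup, system_backup)
where_unknown = find_word("qwerty", active, user_backup, system_backup)
```

A word present in both backup dictionaries is reported from the user-specific one, because that dictionary is consulted before the system-wide one.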

Concerning claims 6 and 51, Young et al. discloses that at least (c) and (d) and 
(e) are performed concurrently as recognized text is displayed during dictation and 
editing (column 15, line 13 to column 16, line 65: Figure 2). 

Concerning claims 8 and 53, Young et al. discloses that speech recognition is 
performed in conjunction with a particular application (e.g., Microsoft Word™), and 
updating the active vocabulary to include a constraint grammar associated with the 
application and a dictation vocabulary (column 15, lines 31 to 66: Figure 2); thus, 
speech recognition is performed "in light of entries included in" a dictation vocabulary 
("said context-enhanced database"). 

Concerning claims 11, 56, and 69, Thelen et al. ('551) discloses that a context 
identifier can consist of a set of keywords, or a sequence of words, which act as a 
search criterion to search for and select a training corpus for a vocabulary and/or 
language model of a speech recognition system (column 3, lines 43 to 58); a set of 
keywords for selecting documents from a larger set of documents is equivalent to "a 
word list" for "creating the context-enhanced database from those entries of a context- 
independent database", respectively. 

Concerning claims 12 to 13 and 57 to 58, Young et al. discloses displaying text 
on a graphical user interface of a word processor (column 15, lines 17 to 24: Figure 2); 
text is temporarily stored in memory 145 of a computer 125 (column 3, lines 44 to 48: 
Figure 1). 

Concerning claims 16 to 17, 61 to 62, and 70 to 71, Young et al. discloses that 
when a particular application is opened ("detecting an event") ("automatically detecting 
a change"), a new constraint grammar is activated ("automatically deriving new input"), 
and the control interface updates the active vocabulary ("responsively updating said 
context-enhanced database") (column 4, lines 62 to 67: Figure 2; column 15, lines 31 to 
38). 

Claims 14 and 59 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Young et al. in view of Thelen et al. ('551) as applied to claims 1 and 46 above, 
and further in view of Mitchell et al. 

Young et al. does not expressly disclose the features of highlighting words 
having a predetermined likelihood of misinterpretation. However, Mitchell et al. teaches 
highlighting words on a display for which a score is less than a threshold score. 
(Column 10, Lines 12 to 18: Figure 8b: Steps S72 and S73) It is suggested that an 
advantage is a processing means that permits any application running on a processor 
that enables character data from speech recognition to be entered and manipulated. 
(Column 2, Lines 45 to 55) It would have been obvious to one having ordinary skill in 
the art to highlight words having a predetermined likelihood of misinterpretation as 
suggested by Mitchell et al. in the speech recognition system of Young et al. for the 
purpose of permitting any application running on a processor to enable speech 
recognition. 
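
As a rough illustration of highlighting words whose score falls below a threshold, consider the following Python sketch. It is not taken from Mitchell et al.; the function name `highlight_low_confidence`, the bracket notation used as the "highlight", and the sample scores are assumptions of the example.

```python
def highlight_low_confidence(scored_words, threshold):
    """Wrap each word whose score is below the threshold in square brackets."""
    marked = []
    for word, score in scored_words:
        marked.append(f"[{word}]" if score < threshold else word)
    return " ".join(marked)


# Hypothetical recognition result: (word, score) pairs.
recognized = [("send", 0.9), ("male", 0.4), ("today", 0.8)]
display_text = highlight_low_confidence(recognized, threshold=0.5)
```

Here only "male" scores below the threshold of 0.5, so it alone is marked for the user's attention.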

Claims 18 and 63 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Young et al. in view of Thelen et al. ('551) as applied to claims 1 and 46 above, 
and further in view of Baker et al. 

Young et al. omits a meaning variants database and a synonym lexicon. 
However, it is known in speech recognition to utilize a thesaurus. Baker et al. teaches a 
reference source 40, which includes a dictionary and thesaurus ("meanings variants 
database" and "synonym lexicon"). (Column 15, Lines 5 to 8) It is stated that problems 
with prior art recognition systems are avoided by performing semantic and linguistic 
analysis through language knowledge. (Column 4, Line 64 to Column 5, Line 8) It 
would have been obvious to one having ordinary skill in the art to utilize a thesaurus as 
taught by Baker et al. in the speech recognition system of Young et al. for the purpose 
of avoiding prior art problems through language knowledge. 



Response to Arguments 

Applicants' arguments filed 16 October 2006 have been fully considered but they 
are not persuasive. 

Applicants argue that the claims are patentable because the prior art does not 
disclose randomly receiving input. Applicants maintain that telephony signals, serially 
generated character strings representing words, and communications signals are 
random signals. Applicants say a signal is random, rather than deterministic, because it 
cannot be anticipated in advance of being received. Specifically, Applicants contend 
that an e-mail is random because its input cannot be deterministically anticipated in 
advance. Applicants characterize the vocabulary of Young et al. as being directed to a 
topic and, thus, as one that cannot be random. These arguments are traversed. 

Firstly, the limitation of "randomly" receiving non-voice input for creating a word 
list fails to meet the written description requirement of 35 U.S.C. §112, 1st ¶, because it 
constitutes new matter and is misdescriptive of Applicants' claimed system and program 
code. Applicants' Specification as originally filed does not in any way characterize the 
input to construct a context-enhanced database as being "randomly" received. Nor 
would one having ordinary skill in the art characterize a received input from an e-mail as 
being "random". Random is a word that has many meanings in various contexts, but 
would not aptly describe the contents of an e-mail. An e-mail is written in a natural 
language (e.g., English), conforms to grammatical structures, consists of words rather 
than random letters, and common words have a higher probability of occurrence than 
uncommon words. An e-mail, or any natural language document, can be compressed 
by arithmetic coding, showing that the content is not truly random. Thus, Applicants' 
limitation of "randomly" receiving non-voice input fails to meet the written description 
requirement of 35 U.S.C. §112, 1st ¶, because it is new matter as not supported by 
Applicants' Specification, and because it misdescribes and mischaracterizes Applicants' 
claimed system and program code. 
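
The compressibility point above can be illustrated with a short Python sketch. As an assumption of the example, it uses the DEFLATE compressor in Python's standard `zlib` module rather than arithmetic coding, and it compares a sample of English text (the statutory quotation reproduced earlier in this action) with an equal number of pseudo-random bytes; the function name `compression_ratio` is hypothetical.

```python
import random
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data)) / len(data)


# English sample: the statutory text quoted earlier in this action.
english = (
    "The specification shall contain a written description of the invention, "
    "and of the manner and process of making and using it, in such full, clear, "
    "concise, and exact terms as to enable any person skilled in the art to "
    "which it pertains, or with which it is most nearly connected, to make and "
    "use the same and shall set forth the best mode contemplated by the "
    "inventor of carrying out his invention."
).encode("ascii")

# Pseudo-random bytes of the same length, seeded so the example is repeatable.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(english)))

english_ratio = compression_ratio(english)
noise_ratio = compression_ratio(noise)
```

Under these assumptions the English sample compresses to fewer bytes than it started with, while the pseudo-random bytes do not compress comparably, which is consistent with the observation that natural-language content is not random in the statistical sense.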

Secondly, it is maintained that the combination of Young et al. and Thelen et al. 
('551) meets the limitation of "randomly" receiving non-voice input to create a context- 
enhanced database, within Applicants' definition of randomness. It is noted that 
Applicants do not limit the creation of a context-enhanced database to text received in 
an e-mail, although neither Young et al. nor Thelen et al. ('551) discloses creating a 
context-enhanced database from e-mail. However, it is maintained that Applicants are 
claiming creating a context-enhanced database from randomly received information in a 
plurality of linked documents. Thelen et al. ('551) discloses creating a vocabulary or 
language model from a set of documents in a document database of a distributed 
computer system, too. (Column 3, Line 20 to Column 4, Line 27; Column 6, Lines 11 to 
45) The precise content of the documents of Thelen et al. ('551) can be described as 
"random" because, under Applicants' definition, the totality of the content of the 
documents is not known in advance. The contents of the documents obey a 
probabilistic distribution, so even though the precise content of any of the documents is 
nondeterministic, the word distribution is probabilistic and random. Granted, Thelen et 
al. ('551) discloses selecting the documents by a search criterion; still, the content of the 
documents selected by the search criterion is not known in advance, and, thus, is 
nondeterministic and random. The precise content of the vocabulary or language model 
created by the search criterion is a function of all of the words in the documents, and not 
simply the search terms. 

Therefore, the rejections of claims 1 to 8, 11 to 18, 46 to 53, 56 to 66, and 69 to 
71 under 35 U.S.C. 112, first paragraph, as failing to comply with the written description 
requirement; of claims 1 to 8, 11 to 13, 15 to 17, 46 to 53, 56 to 58, 60 to 62, 64 to 66, 
and 69 to 71 under 35 U.S.C. 103(a) as being unpatentable over Young et al. in view of 
Thelen et al. ('551); of claims 14 and 59 under 35 U.S.C. 103(a) as being unpatentable 
over Young et al. in view of Thelen et al. ('551), and further in view of Mitchell et al.; and 
of claims 18 and 63 under 35 U.S.C. 103(a) as being unpatentable over Young et al. in 
view of Thelen et al. ('551), and further in view of Baker et al., are proper. 

Conclusion 

The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure. 

Ramaswamy et al. discloses related art. 

Applicants' amendment necessitated the new grounds of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicants are reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner whose telephone number is (571) 272- 
7608. The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to 
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David R. Hudspeth can be reached on (571) 272-7843. The fax phone 
number for the organization where this application or proceeding is assigned is 571- 
273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

Martin Lerner 
Examiner 